Adapting the Unisyn Lexicon to Portuguese: Preliminary issues in the development of LUPo

Authors

  • Simone Ashby
  • José Pedro Ferreira
  • Sílvia Barbosa
Abstract

This paper presents some preliminary issues and proposed solutions in the development of an accent-independent pronunciation lexicon for Portuguese, known as the Portuguese Unisyn Lexicon (LUPo). LUPo's objectives are presented within the context of the Portal da Língua Portuguesa knowledge base. Key considerations are addressed for encoding morphological boundaries, treating orthographic forms, and handling loan words. It is argued that the knowledge-driven paradigm exemplified in the original English Unisyn Lexicon, together with the Portal da Língua Portuguesa's relational structure and rich lexicographic content, presents a good foundation for establishing a tightly integrated and well-informed system.

Similar resources

A Rule Based Pronunciation Generator and Regional Accent Databank for Portuguese

One of the major obstacles in deploying spoken language technologies (SLTs) in the developing world is a lack of key linguistic resources – e.g. electronic dictionaries, phonetically aligned corpora, pronunciation lexicons, etc. – that describe the non-dominant varieties spoken in such countries and regions. In this paper, we describe the work of the LUPo (Portuguese Unisyn Lexicon) project to ...

The Role of Morphology in Generating High-Quality Pronunciation Lexica for Regional Variants of Portuguese

Grapheme to phoneme (GTP) systems for languages such as English, German, and Korean have been shown to achieve better performance rates with the inclusion of a morpho-phonological preprocessing component. While semiautomatic and automatic GTP approaches for Portuguese continue to achieve steady gains, such algorithms do not take morphology into account, despite a growing need to do so, based in...

Models of EFL Learners’ Vocabulary Development: Spreading Activation vs. Hierarchical Network Model

Semantic network approaches view organization or representation of internal lexicon in the form of either spreading or hierarchical system identified, respectively, as Spreading Activation Model (SAM) and Hierarchical Network Model (HNM). However, the validity of either model is amongst the intact issues in the literature which can be studied through basing the instruction compatible wi...

Mainland Chinese Students’ Shifting Perceptions of Chinese-English Code-Mixing in Macao

As a former Portuguese colony, Macao is the only region in China where Cantonese, a variety of Chinese, and English, an international language, are enjoying de facto official statuses, with Putonghua being a quasi-official language and Portuguese being another official language. Recently, with an increasing number of Mainland Chinese students crossing the border to pursue their tertiar...

The Keyword Lexicon - An accent-independent lexicon for automatic speech recognition

Recent work at the Centre for Speech Technology Research (CSTR) at the University of Edinburgh has developed an accent-independent lexicon for speech synthesis (the Unisyn project). The main purpose of this lexicon is to avoid the problems and cost of writing a new lexicon for every new accent needed for synthesis. Only recently [1], a first attempt has been made to use the Keyword Lexicon for ...

Publication date: 2010